A new approach to identifying Chinese maximal-length phrases by combining bidirectional labeling
Abstract
Chinese maximal-length phrases (maximal-length noun phrases and prepositional phrases) possess notable linguistic properties. Bidirectional labeling results for these phrases, obtained by sequential classifiers, reveal complementary properties in the two directions of Chinese sentences. In this paper, both left-to-right and right-to-left sequential labeling are used to identify Chinese maximal-length noun phrases and prepositional phrases. A novel “fork position” based probabilistic algorithm is then presented to combine the bidirectional results. Experiments carried out on the Penn Chinese Treebank confirm that the proposed combining algorithm exploits the complementary strengths of the two directions effectively.
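The abstract does not spell out the combining algorithm, so the following is only a rough sketch of how two directional labelings might be merged. It assumes each labeler emits one (tag, probability) pair per token, treats the first token of each run where the two directions disagree as a hypothetical "fork position", and keeps whichever direction has the higher summed log-probability over that run. The function name `combine_bidirectional`, the B/I/O tag set, and the scoring rule are illustrative assumptions, not the authors' method.

```python
from math import log
from typing import List, Tuple

# One (tag, probability) pair per token, in sentence order.
# Probabilities are assumed to lie in (0, 1].
Labeled = List[Tuple[str, float]]


def combine_bidirectional(l2r: Labeled, r2l: Labeled) -> List[str]:
    """Merge left-to-right and right-to-left labelings of one sentence.

    Hypothetical rule: where the tags agree, keep them. Each maximal run
    of disagreement begins at a "fork position"; inside the run, keep the
    direction whose summed log-probability is higher.
    """
    if len(l2r) != len(r2l):
        raise ValueError("labelings must cover the same tokens")

    merged: List[str] = []
    i, n = 0, len(l2r)
    while i < n:
        if l2r[i][0] == r2l[i][0]:
            merged.append(l2r[i][0])
            i += 1
            continue
        # i is a fork position: extend j to the end of the disagreement.
        j = i
        while j < n and l2r[j][0] != r2l[j][0]:
            j += 1
        score_l2r = sum(log(p) for _, p in l2r[i:j])
        score_r2l = sum(log(p) for _, p in r2l[i:j])
        winner = l2r if score_l2r >= score_r2l else r2l
        merged.extend(tag for tag, _ in winner[i:j])
        i = j
    return merged


# Made-up example: the two directions disagree only on the third token.
l2r_tags = [("B", 0.9), ("I", 0.6), ("O", 0.55), ("O", 0.9)]
r2l_tags = [("B", 0.8), ("I", 0.7), ("I", 0.8), ("O", 0.95)]
print(combine_bidirectional(l2r_tags, r2l_tags))
```

Summing log-probabilities over the whole disagreement run, rather than choosing token by token, keeps each resolved span internally consistent with a single direction, which matters for phrase boundaries.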
Similar resources
Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM
Selecting appropriate words to compose a sentence is a common problem faced by non-native Chinese learners. In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. By combining CWINDOW word embedding features and POS information, the best bidirectional LSTM model achieves accuracy 0.5138 and MRR 0.6...
CFN-based Semantic Role Labeling of Chinese Prepositional Phrase
Prepositional Phrases are often among the most frequent expressions in Chinese, but they have been ignored on the grounds of being syntactically promiscuous and semantically vacuous, and relegated to the ignominious rank of “stop word”. The Chinese FrameNet (CFN) is a lexical resource project developed by Shanxi University, Taiyuan, based on the principles of Frame Semantics and supported by co...
The Identification and Classification of Unknown Words in Chinese: An N-Grams-Based Approach
In this paper, we propose a new approach to identifying unknown words in Chinese. This approach adopts an n-grams program to sort out the collocating word/character sequences that are possible words and phrases in Chinese. In addition to proposing criteria for identifying new Chinese words, we also classify these new words according to their structural and semantic characteristics. The cor...
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods such as entity-based and graph-based models deal with how nouns and noun phrases change role across sequential sentences within a short part of a text. They even have limitations in global coheren...
Chinese Semantic Role Labeling with Bidirectional Recurrent Neural Networks
Traditional approaches to Chinese Semantic Role Labeling (SRL) rely heavily on feature engineering. Even worse, the long-range dependencies in a sentence can hardly be modeled by these methods. In this paper, we introduce a bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) to capture bidirectional and long-range dependencies in a sentence with minimal feature ...